Predicting Molecule Toxicity via Descriptor-based Graph Self-supervised Learning
Authors
Abstract
Predicting molecular properties with Graph Neural Networks (GNNs) has recently drawn a lot of attention, with compound toxicity prediction being one of the biggest challenges. In cases where there is insufficient labeled molecule data, an effective approach is to pre-train GNNs on large-scale unlabeled data and then fine-tune them for downstream tasks. Among pre-training strategies, node-level pre-training involves masking and predicting atom properties, while motif-based methods capture rich information in subgraphs. These approaches have shown effectiveness across various downstream tasks. However, current frameworks face two main challenges: (1) auxiliary tasks do not preserve useful domain knowledge, and (2) fusion is computationally expensive. To address these challenges, we propose Descriptor-based Graph Self-supervised Learning (DGSSL), a method that utilizes domain knowledge to enhance graph representation learning. Specifically, it identifies descriptor centers in molecules and encodes motif-like information as special atomic numbers. This enables self-supervised learning to also capture local information. Experimental results demonstrate that our method achieves state-of-the-art performance on three toxicity-related benchmarks.
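The idea of relabeling descriptor centers with special atomic numbers and then running a node-level mask-and-predict task can be illustrated with a small sketch. This is not the paper's implementation: the two center rules, the special IDs 119 and 120, the mask token 0, and all function names are hypothetical choices made for illustration, and the molecule is a hand-written adjacency list rather than output from a chemistry toolkit.

```python
import random

# Hypothetical vocabulary: real elements use atomic numbers 1..118, so
# motif-like labels are given IDs above that range (an assumption).
SPECIAL_IDS = {"carboxyl_like": 119, "nitro_like": 120}
MASK_ID = 0  # hypothetical mask token


def find_descriptor_centers(atomic_nums, neighbors):
    """Toy rule set: map atom index -> motif name for atoms acting as centers."""
    centers = {}
    for i, z in enumerate(atomic_nums):
        n_oxygen = sum(1 for j in neighbors[i] if atomic_nums[j] == 8)
        if z == 6 and n_oxygen >= 2:
            centers[i] = "carboxyl_like"
        elif z == 7 and n_oxygen >= 2:
            centers[i] = "nitro_like"
    return centers


def encode_with_special_numbers(atomic_nums, neighbors):
    """Replace each center's atomic number with its special ID."""
    centers = find_descriptor_centers(atomic_nums, neighbors)
    return [SPECIAL_IDS[centers[i]] if i in centers else z
            for i, z in enumerate(atomic_nums)]


def mask_nodes(labels, mask_rate=0.15, seed=0):
    """Node-level pretext task: hide some labels and keep them as targets."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(labels)))
    masked_idx = set(rng.sample(range(len(labels)), n_mask))
    inputs = [MASK_ID if i in masked_idx else z for i, z in enumerate(labels)]
    targets = {i: labels[i] for i in masked_idx}
    return inputs, targets


# Heavy atoms of acetic acid: C(0)-C(1)(=O(2))-O(3)
atomic_nums = [6, 6, 8, 8]
neighbors = [[1], [0, 2, 3], [1], [1]]

encoded = encode_with_special_numbers(atomic_nums, neighbors)
inputs, targets = mask_nodes(encoded, mask_rate=0.25, seed=0)
print(encoded)
print(inputs, targets)
```

In this toy setup, predicting a masked center requires recognizing its oxygen neighbors, so the node-level objective carries some subgraph information without a separate motif-level branch. A real pipeline would feed `inputs` to a GNN and train it to recover `targets`.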
Journal
Journal title: IEEE Access
Year: 2023
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2023.3308203